{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "ef7e0036",
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "269b4273",
   "metadata": {},
   "source": [
    "# 填充"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68fb6a11",
   "metadata": {},
   "source": [
    "通常，如果我们添加$p_h$行填充（大约一半在顶部，一半在底部）和$p_w$列填充（左侧大约一半，右侧一半），则输出形状将为\n",
    "\n",
    "$$(n_h-k_h+p_h+1)\\times(n_w-k_w+p_w+1)。$$\n",
    "\n",
    "这意味着输出的高度和宽度将分别增加$p_h$和$p_w$。\n",
    "\n",
    "在许多情况下，我们需要设置$p_h=k_h-1$和$p_w=k_w-1$，使输入和输出具有相同的高度和宽度。\n",
    "这样可以在构建网络时更容易地预测每个图层的输出形状。假设$k_h$是奇数，我们将在高度的两侧填充$p_h/2$行。\n",
    "如果$k_h$是偶数，则一种可能性是在输入顶部填充$\\lceil p_h/2\\rceil$行，在底部填充$\\lfloor p_h/2\\rfloor$行。同理，我们填充宽度的两侧。\n",
    "\n",
    "卷积神经网络中卷积核的高度和宽度通常为奇数，例如1、3、5或7。\n",
    "选择奇数的好处是，保持空间维度的同时，我们可以在顶部和底部填充相同数量的行，在左侧和右侧填充相同数量的列。\n",
    "\n",
    "此外，使用奇数的核大小和填充大小也提供了书写上的便利。对于任何二维张量`X`，当满足：\n",
    "1. 卷积核的大小是奇数；\n",
    "2. 所有边的填充行数和列数相同；\n",
    "3. 输出与输入具有相同高度和宽度\n",
    "则可以得出：输出`Y[i, j]`是通过以输入`X[i, j]`为中心，与卷积核进行互相关计算得到的。\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "a0850bf8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Size([8, 8])"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def comp_conv2d(conv2d, X):\n",
    "    X = X.reshape((1, 1) + X.shape)\n",
    "    Y = conv2d(X)\n",
    "    return Y.reshape(Y.shape[2:])\n",
    "\n",
    "conv2d = nn.Conv2d(1, 1, kernel_size=3, padding=1)\n",
    "X = torch.rand(8, 8)\n",
    "comp_conv2d(conv2d, X).shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "92093ba9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Size([8, 8])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "conv2d = nn.Conv2d(1, 1, kernel_size=(5, 3), padding=(2, 1)) # 填充 （5-1）/2 和 （3-1）/2\n",
    "comp_conv2d(conv2d, X).shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "192f0eeb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Size([4, 4])"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "conv2d = nn.Conv2d(1, 1, kernel_size=3, padding=1, stride=2) # 8-3+1+2 / 2 = 4\n",
    "comp_conv2d(conv2d, X).shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "55010778",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Size([2, 2])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "conv2d = nn.Conv2d(1, 1, kernel_size=(3, 5), padding=(0, 1), stride=(3, 4)) # 8 - 3 + 0 + 3 / 3 = 2, 8 - 5 + 1 + 4 / 3 = 2\n",
    "comp_conv2d(conv2d, X).shape"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34a6814d",
   "metadata": {},
   "source": [
    "# 步幅"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dee1ed04",
   "metadata": {},
   "source": [
    "通常，当垂直步幅为$s_h$、水平步幅为$s_w$时，输出形状为\n",
    "\n",
    "$$\\lfloor(n_h-k_h+p_h+s_h)/s_h\\rfloor \\times \\lfloor(n_w-k_w+p_w+s_w)/s_w\\rfloor.$$\n",
    "\n",
    "如果我们设置了$p_h=k_h-1$和$p_w=k_w-1$，则输出形状将简化为$\\lfloor(n_h+s_h-1)/s_h\\rfloor \\times \\lfloor(n_w+s_w-1)/s_w\\rfloor$。\n",
    "更进一步，如果输入的高度和宽度可以被垂直和水平步幅整除，则输出形状将为$(n_h/s_h) \\times (n_w/s_w)$。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python(d2l)",
   "language": "python",
   "name": "d2l"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.17"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
